Experience Replay is widely used in off-policy reinforcement learning. With cpprb, you can start your experiments quickly without implementing a troublesome replay buffer yourself.
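As a minimal sketch of getting started, the snippet below follows cpprb's documented `ReplayBuffer` interface; the buffer size, shapes, and random transition data are placeholders.

```python
import numpy as np
from cpprb import ReplayBuffer

buffer_size = 256
rb = ReplayBuffer(buffer_size,
                  env_dict={"obs": {"shape": 4},
                            "act": {"shape": 1},
                            "rew": {},
                            "next_obs": {"shape": 4},
                            "done": {}})

# Store one transition per call; keyword names must match env_dict.
for _ in range(100):
    rb.add(obs=np.random.rand(4), act=np.random.rand(1), rew=0.0,
           next_obs=np.random.rand(4), done=0.0)

# Sample a mini-batch as a dict of NumPy arrays.
batch = rb.sample(32)
obs, act = batch["obs"], batch["act"]
```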
The heavy calculation is implemented in C++ and Cython, so cpprb is usually faster than a naive pure-Python implementation.
cpprb supports Ape-X style learning on a single computer. You don't need to worry about tricky locking yourself: cpprb internally locks only the critical sections.
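A rough sketch of a single-machine Ape-X style setup is shown below. It assumes cpprb's multi-process buffer class `MPPrioritizedReplayBuffer` and its `sample`/`update_priorities` interface; the `explorer` and `learner` functions, worker counts, and priority values are illustrative only.

```python
from multiprocessing import Process

import numpy as np
from cpprb import MPPrioritizedReplayBuffer


def explorer(rb):
    """Keep adding random transitions; cpprb serializes access internally."""
    for _ in range(1000):
        rb.add(obs=np.random.rand(4), act=np.random.rand(1), rew=0.0,
               next_obs=np.random.rand(4), done=0.0)


def learner(rb):
    """Sample mini-batches and feed back updated priorities."""
    while rb.get_stored_size() < 100:
        pass  # wait until explorers have filled the buffer a bit
    sample = rb.sample(32, beta=0.4)
    # Placeholder priorities; in practice these would come from TD errors.
    rb.update_priorities(sample["indexes"],
                         np.full_like(sample["weights"], 0.5))


if __name__ == "__main__":
    env_dict = {"obs": {"shape": 4}, "act": {"shape": 1},
                "rew": {}, "next_obs": {"shape": 4}, "done": {}}
    rb = MPPrioritizedReplayBuffer(int(1e5), env_dict)

    workers = [Process(target=explorer, args=(rb,)) for _ in range(2)]
    trainer = Process(target=learner, args=(rb,))
    for p in (*workers, trainer):
        p.start()
    for p in (*workers, trainer):
        p.join()
```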
cpprb adopts a flexible environment definition. Any number of NumPy-compatible environment values can be stored.
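For example, the `env_dict` below is a sketch of storing an extra per-step value alongside image observations; the key names, shapes, and dtypes (such as the `level` flag) are illustrative, not required by cpprb.

```python
import numpy as np
from cpprb import ReplayBuffer

rb = ReplayBuffer(
    10_000,
    env_dict={"obs": {"shape": (84, 84, 3), "dtype": np.uint8},       # image frame
              "act": {"shape": 1, "dtype": np.int32},                  # discrete action
              "rew": {},                                               # scalar reward
              "next_obs": {"shape": (84, 84, 3), "dtype": np.uint8},
              "done": {},
              "level": {"dtype": np.int32}})                           # any extra per-step value

rb.add(obs=np.zeros((84, 84, 3), dtype=np.uint8),
       act=2, rew=1.0,
       next_obs=np.zeros((84, 84, 3), dtype=np.uint8),
       done=0.0, level=3)
```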